Systems for creating and/or maintaining databases and a system for facilitating online advertising with improved privacy

ABSTRACT

A system for creating and/or maintaining a database is disclosed. In one example, the system includes one or more processors; a classification module configured to determine primary weights for primary data streams, each primary weight referring to a correlation between one of the primary data streams and one segment category of several predefined segment categories; a recognition module configured to identify explicit concepts and implicit concepts in the primary data streams, and to determine first secondary weights characterizing embeddings of the identified concepts; an expansion module configured to determine for the identified concepts respective related concepts; and a storage module configured to save the identified concepts.

TECHNICAL FIELD

Embodiments of the present invention relate to systems for creating and/or maintaining databases, in particular server based systems for creating and/or maintaining databases, a system for facilitating online advertising with improved privacy in a network, and related methods.

BACKGROUND

The World Wide Web (WWW) has become one of the most important sources of information and is widely used as a media channel for advertising. Processing and providing of data in the WWW is a dynamic process. This also applies to digital (media) advertising, which is often used to generate revenue for web site and web service providers. Typically, digital advertising or online advertising is provided in the form of so-called ads (advertisements, ad for short), in particular as visual and/or audio ads, which are delivered dynamically and almost in real time on a webpage in the browser or in other kinds of digital user interfaces like mobile apps or videos. As a result, the specific ad shown embedded in or next to a piece of content may change over time or differ depending on the user consuming the content.

Many technologies are available to target the digital user. This means that a system selects the ad to be displayed depending on the individual user. Accordingly, a greater flexibility for advertisers and/or better tailored advertising campaigns may be achieved. For this, the system needs to record, aggregate and store some kind of personal data or at least anonymous profiles based on personal data. For example, cookies may be used for user tracking and gathering information about the users, respectively.

However, there is a growing concern about the downsides of collecting huge amounts of personal data in the Web. Among other things, calculating user behavior based on personal data collection can facilitate the manipulation of users. Furthermore, data collecting, data processing and advertising are resource intensive with often only moderate success even if behavioral targeting is used.

Currently, there are increasing efforts to regulate collecting personal data. For example, in the EU the General Data Protection Regulation (GDPR) recently came into effect. Accordingly, collecting, storing and processing of any kind of personal data is subject to severe restrictions and needs to meet many prerequisites, and any violation of these requirements is subject to high penalties. GDPR applies to the data of any citizen of the EU, to any system that is located on the terrain of the EU and even to any person entering the territorial region of the EU. That means GDPR is even of global reach. With GDPR, the use of targeting technologies that make use of personal data poses a high risk for any player in the market: the technology providers as data processors, the publishers, as well as agencies and the advertisers themselves. Currently, the European e-Privacy regulation is in the final process of legislation at EU level. It will likely restrict the use of personal data in online advertising even further. E-Privacy is aiming at giving the user full control of the way his personal data is used and who it is shared with. For example, e-Privacy will disallow the storing of cookies unless the user has given explicit and informed consent.

Accordingly, there is a need to further improve data collecting, data processing and/or digital advertising in a network, in particular in the WWW.

BRIEF DESCRIPTION OF THE DRAWINGS

The components in the figures are not necessarily to scale, instead emphasis being placed upon illustrating the principles of the invention. Moreover, in the figures, like reference numerals designate corresponding parts. In the drawings:

FIG. 1 is a block diagram schematically illustrating a system for creating, updating and/or maintaining a database and processes of a method for creating and/or maintaining the database according to embodiments;

FIG. 2 is a block diagram schematically illustrating a system for maintaining a database and processes of a method for maintaining the database according to embodiments;

FIG. 3 schematically illustrates processes of a method for maintaining the database according to embodiments;

FIG. 4 is a block diagram schematically illustrating a system for creating and/or maintaining a database and processes of a method for creating and/or maintaining the database according to embodiments;

FIG. 5 is a block diagram schematically illustrating a system for facilitating online advertising with improved privacy and processes of a method for facilitating online advertising with improved privacy according to embodiments;

FIG. 6 is a block diagram schematically illustrating a system for facilitating online advertising with improved privacy;

FIG. 7A illustrates a flow chart of a method for creating, updating and/or maintaining a database according to an embodiment; and

FIG. 7B illustrates a flow chart of a method for facilitating online advertising with improved privacy in a network according to an embodiment.

DETAILED DESCRIPTION

According to an embodiment of a system for creating and/or maintaining a first database, the system includes one or more processors, a classification module that is, when executed by at least one of the one or more processors, configured to determine primary weights for primary data streams, each primary weight referring to a correlation between one of the primary data streams and one segment category of several predefined segment categories, a recognition module that is, when executed by at least one of the one or more processors, configured to identify explicit concepts and implicit concepts in the primary data streams, and to determine first secondary weights characterizing embeddings of the identified concepts in the respective main segment category with highest primary weight using a concept database storing weights for concepts within the predefined segment categories, an expansion module that is, when executed by at least one of the one or more processors, configured to determine for the identified (explicit and implicit) concepts respective related concepts and second secondary weights of the related concepts characterizing embeddings of the related concepts in at least one of the predefined segment categories; and a storage module that is, when executed by at least one of the one or more processors, configured to save the identified concepts, the first secondary weights, the related concepts, and the second secondary weights in the first database.

The primary data streams may refer to webpages provided in the network such as the WWW, and/or may represent a respective content of the webpages, in particular a respective text of websites to be displayed in a web application such as a web browser or in another application software (app), in particular a mobile app running on a client such as a client computer, for example a mobile client device such as a tablet or smartphone.

According to an embodiment of a computer-implemented method for creating and/or maintaining a first database, the method includes determining primary weights for primary data streams. Each primary weight refers to a correlation between one of the primary data streams and one segment category of several predefined segment categories. Concepts are identified in the primary data streams. This may include identifying explicit concepts and identifying implicit concepts. First secondary weights characterizing embeddings of the identified concepts in the respective main segment category with highest primary weight are determined using a concept database storing weights for concepts within the predefined segment categories. For the identified (explicit and implicit) concepts, respective related concepts and second secondary weights of the related concepts characterizing embeddings of the related concepts in at least one of the predefined segment categories are determined.

The identified concepts, the first secondary weights, the related concepts, and the second secondary weights may be stored in the first database. The method typically includes identifying in the primary data streams a rating including at least one of a sentiment and an emotion. The identified rating may also be stored in the first database.

According to an embodiment of a system for creating and/or maintaining a second database, the system includes one or more processors, a classification module that is, when executed by at least one of the one or more processors, configured to determine primary weights for primary data streams, each primary weight referring to a correlation between one of the primary data streams and one segment category of several predefined segment categories; a concept learning module that is, when executed by at least one of the one or more processors, configured to determine, for known concepts which comprise a respective term found in the primary data streams, embedding terms for the respective term and weights characterizing the embeddings of the embedding terms in the respective segment categories; and a storage module configured to update the known concepts stored in the second database in accordance with the embedding terms and weights characterizing the embeddings of the embedding terms.

According to an embodiment of a computer-implemented method for creating and/or maintaining a second database, the method includes determining primary weights for primary data streams. Each primary weight refers to a correlation between one of the primary data streams and one segment category of several predefined segment categories. For known concepts which comprise a respective term found in the primary data streams, embedding terms and weights characterizing the embeddings of the embedding terms in the respective segment categories are determined. The known concepts stored in the second database may be updated in accordance with the embedding terms and the weights characterizing the embeddings of the embedding terms.

According to an embodiment of a system for creating and/or maintaining a third database, the system comprises one or more processors; a semantic analysis module that is, when executed by at least one of the one or more processors, configured to determine primary weighted semantic metadata for a content, in particular a target content or a content of an advertising campaign, using a database storing weights for concepts within predefined segment categories; a semantic expansion module that is, when executed by at least one of the one or more processors, configured to determine secondary weighted semantic metadata for the content using a further database storing first concepts, first weights for the first concepts in predefined segment categories, second concepts that are related to the first concepts, and the second weights for the second concepts in the predefined segment categories. The semantic expansion module may be configured to identify a rating of the content. The rating includes at least one of a sentiment and an emotion. The system may further comprise a storage module that is, when executed by at least one of the one or more processors, configured to store the respective weighted semantic metadata and optionally the rating in the third database. The third database may be a target database or a campaign database.

The target content may refer to webpages provided in the network such as the WWW, and/or may represent a respective content of the webpages, in particular a respective text of the websites to be displayed in a web application such as a web browser or in another application software (app), in particular a mobile app running on a client such as a client computer, for example a mobile client device such as a tablet or smartphone.

According to an embodiment of a computer-implemented method for determining semantic metadata for a content, in particular a target content or a content of an advertising campaign, the method includes determining primary weighted semantic metadata for the content using a database storing weights for concepts within predefined segment categories, and determining secondary weighted semantic metadata for the content using a further database storing first concepts, first weights for the first concepts in predefined segment categories, second concepts that are related to the first concepts, and the second weights for the second concepts in the predefined segment categories. The method may include identifying a rating of the content. The rating includes at least one of a sentiment and an emotion. The method may further include storing the respective weighted semantic metadata in a database for content metadata, and optionally storing the rating in the database for content metadata. The database for content metadata may be a target database or a campaign database.

According to an embodiment of a system for facilitating online advertising in a network with improved privacy, the system includes one or more processors, a matching module that is, when executed by at least one of the one or more processors, configured to use weighted semantic target metadata for a target content which is provided in the network and weighted semantic campaign metadata for an advertising campaign which is to be presented in the network and refers to a respective product and/or a service to determine a matching parameter between the target content and the advertising campaign, and a management module that is, when executed by at least one of the one or more processors, configured to use the matching parameter for deciding if the advertising campaign is to be provided to the target content.

The system for facilitating online advertising is typically configured to initiate sending one or more advertising campaigns to client(s) at least substantially based on the matching parameter(s), or at least substantially based on the matching parameter(s) and identified ratings. Initiating sending the advertising campaign may be achieved without using any tracking data of registered users of the client. Accordingly, online advertising may be achieved in a privacy compliant manner.

Since no personal tracking data are required, less data have to be transferred through the network for advertising. Furthermore, less data space and/or less computational power may be required by the system and/or in the network. Accordingly, online advertising may be achieved with lower energy consumption and thus in a more environmentally friendly manner. Even further, the runtime of mobile client devices may be increased as collecting and sending of personal tracking data can be avoided.

As the selection of the advertising campaigns is based on weighted content criteria (concept matching between advertising campaigns and target content such as websites), the placement of the advertising campaigns may be achieved with higher target accuracy. This is facilitated by taking into account implicit concepts and/or sentiments. Note that misplacements of advertising campaigns may be at least substantially reduced. All this may also contribute to more environmentally friendly advertising in the network and even lead to an increased acceptance of advertising by the users as well as increased revenue per advertising campaign for service/content providers in the network.

As the selection of the advertising campaigns considers topical surroundings of any kind (including quotations of topics in connection with certain places, times or events or other topics in general), the advertising campaign may be able to appeal to latent interests of the user and in this way address target groups that are not reached by user-based targeting.

As the placements of the advertising campaigns may be optimized for highest effectiveness and a more meaningful relation to what the user is reading, viewing or hearing, the number of placed ads can be reduced. Publisher web-sites will therefore be less cluttered with ads, and users will have less need for ad-blocking extensions in their web browsers.

The system for facilitating online advertising may include at least portions of the system for creating and/or maintaining the first database, the system for creating and/or maintaining the second database and/or the system for creating and/or maintaining the third database as respective sub-systems.

The system for facilitating online advertising typically includes one server connectable to the network via a respective interface or several (interconnectable) servers, for example one server for each sub-system.

According to an embodiment of a computer-implemented method for facilitating online advertising in a network with improved privacy, the method includes using weighted semantic target metadata for a target content which is provided in the network, and weighted semantic campaign metadata for an advertising campaign which is to be presented in the network and refers to a respective product and/or a service to determine a matching parameter between the target content and the advertising campaign. The matching parameter is used for deciding if the advertising campaign is to be provided to the target content.
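By way of a non-limiting illustration, the determination of a matching parameter may be sketched as follows. The description does not prescribe a particular formula; the sketch assumes, purely for illustration, that the weighted semantic metadata are given as mappings from concepts to weights and that the matching parameter is a cosine similarity compared against a threshold. The names `matching_parameter`, `should_place` and the threshold value are hypothetical.

```python
import math

def matching_parameter(target, campaign):
    """Cosine similarity between two {concept: weight} mappings (assumed metric)."""
    shared = set(target) & set(campaign)
    dot = sum(target[c] * campaign[c] for c in shared)
    norm_t = math.sqrt(sum(w * w for w in target.values()))
    norm_c = math.sqrt(sum(w * w for w in campaign.values()))
    if norm_t == 0.0 or norm_c == 0.0:
        return 0.0  # no usable metadata on at least one side
    return dot / (norm_t * norm_c)

def should_place(target, campaign, threshold=0.5):
    """Decide whether the campaign is provided to the target content.

    The threshold is a hypothetical tuning value, not taken from the description.
    """
    return matching_parameter(target, campaign) >= threshold

# Illustrative metadata; no user tracking data is involved in the decision.
target_metadata = {"running": 0.9, "jogging": 0.7, "traffic jam": 0.1}
campaign_metadata = {"running": 0.8, "running shoe": 0.9}
print(should_place(target_metadata, campaign_metadata))
```

Only content-derived weights enter the decision in this sketch, which mirrors the privacy rationale given above.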

The respective weighted semantic metadata may be retrieved from respective (third) databases (target database, campaign database), and/or determined by the methods explained herein, in particular the method for determining semantic metadata for a content.

Other embodiments include corresponding computer-readable storage media or devices, and computer programs recorded on one or more computer-readable storage media or computer storage devices, each configured to perform the processes of the methods described herein.

The computer program product and/or a computer-readable storage medium may include instructions which, when executed by one or more processors of a system, in particular a data processing system (also referred to as information processing system) connectable to a network, cause the system to carry out the processes of the methods explained herein.

The system of and/or including one or more computers and/or processors can be configured to perform particular operations or processes by virtue of software, firmware, hardware, or any combination thereof installed on the one or more computers and/or processors that in operation may cause a data processing system to perform the processes.

Those skilled in the art will recognize additional features and advantages upon reading the following detailed description, and upon viewing the accompanying drawings.

In the following Detailed Description, reference is made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. In this regard, directional terminology, such as “top,” “bottom,” “front,” “back,” “leading,” “trailing,” etc., is used with reference to the orientation of the Figure(s) being described. Because components of embodiments can be positioned in a number of different orientations, the directional terminology is used for purposes of illustration and is in no way limiting. It is to be understood that other embodiments may be utilized and structural or logical changes may be made without departing from the scope of the present invention. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.

FIG. 1 illustrates in a block diagram a system 100 and processes for creating, updating and/or maintaining a first database 70. In the exemplary embodiment, the first database 70 stores the data, which refer to respective content 10 such as the content of websites presented via and/or in a network, in a semantic knowledge graph. In the following, the semantic knowledge graph is also referred to as SKG. The semantic knowledge graph structures are particularly well suited for retrieving the stored data/information {i, W_(ij), c_(ik), w_(ik), r_(ikm), w_(ikm), s_(i), e_(i)}. Note that brackets { } indicate a collection of values and/or data which may have indices (subscripts), in particular data sets.
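For illustration only, one stored data set {i, W_(ij), c_(ik), w_(ik), r_(ikm), w_(ikm), s_(i), e_(i)} may be pictured as the following record. The class and field names are hypothetical, and the representation of the sentiment as a number and of the emotion as a label is an assumption; an actual SKG would typically hold these values as nodes and weighted edges of a graph.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ConceptEntry:
    concept: str                      # identified concept c_ik
    weight: float                     # first secondary weight w_ik
    # related concept r_ikm -> second secondary weight w_ikm
    related: Dict[str, float] = field(default_factory=dict)

@dataclass
class ContentRecord:
    content_id: int                   # index i of the primary content D_i
    segment_weights: Dict[str, float] # segment category S_j -> primary weight W_ij
    concepts: List[ConceptEntry] = field(default_factory=list)
    sentiment: float = 0.0            # s_i (assumed numeric)
    emotion: str = ""                 # e_i (assumed class label)

record = ContentRecord(
    content_id=1,
    segment_weights={"technology": 0.8, "health": 0.1},
    concepts=[ConceptEntry("virus", 0.9, {"hacking": 0.7})],
    sentiment=-0.4,
    emotion="fear",
)
print(record.concepts[0].related)
```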

The system 100 may be implemented as or include a server 100 connected to the network and having access to and/or hosting the first database 70 and a second database 60 that may be used for determining the data to be stored in the first database 70. In the following, the first database 70 and the second database 60 are also referred to as first database 70 and concept database 60, respectively. Creating, updating and/or maintaining the second database 60 is explained below with regard to FIG. 2.

The system and server 100, respectively, may be a sub-system of the systems 500, 500′ explained below with regard to FIGS. 5 and 6.

The server 100 may be used for pre-emptively crawling the network and for knowledge graph generation.

For the sake of clarity, only modules (or components) that may be implemented in software and executed by one or more processors of the server 100 as well as the flow of data/information (arrows) are shown, whereas the hardware components such as the processor(s) and storage media are omitted. This also applies to the following figures.

Improved advertising typically includes anticipating probable associations and emotions a user might have when consuming (reading or viewing) certain content in the internet (WWW). This allows for displaying to the user commercial ads that most likely have a positive impact on the user, without even knowing who he is. Likewise, presenting the user with ads that put a brand and/or its commercial message in a negative context may be avoided.

To achieve that, knowledge about the relevance and sentiment about any given topic in various predefined segment categories {S_(j)}, wherein j is a positive integer, is to be systematically and continuously built up and typically stored as SKG in the first database 70.

The predefined segment categories {S_(j)} correspond to subject domains or (generic) topics. Segment categories may, but need not, correspond to content categories on web-sites like “finance”, “travel”, “personal health” etc. Segment categories can be organized hierarchically in a taxonomy. This includes lower-tier categories like “Disasters” in “News and Politics”, “Insurance” in “Personal Finance” or “Celebrity Pregnancy” in “Pop culture”. Segment categories can systematically be extended by N sub-level segments (lower-level tiers) for a higher precision of the system.

For generating the SKG, a web-crawler (not shown) may search through the web to generate primary data streams referring to web-pages 10. In the following, the primary data streams are also referred to as primary data.

The web-crawler may extract respective primary content D_(i) from i URLs (Uniform Resource Locator, web address), wherein i is a typically large positive integer. For example, the number i of extracted primary contents D_(i) (e.g. from i webpages) may be larger than 10⁴, more typically larger than 10⁶, and even more typically larger than 10⁸. The primary content D_(i) may be defined by the body content of a web page, its title and various other meta-data like author, time of publishing etc. The URLs along with the primary content D_(i) may be stored in a content index database 50. The primary content is also referred to as target content herein.

Each primary content D_(i) may be processed by the modules of the exemplary server 100.

First, a content classification module 110 may be used to determine for primary content D_(i), and thus for the primary data, primary weights {W_(ij)} representing a respective correlation between the content of the primary data D_(i) and one segment category S_(j) of the predefined segment categories {S_(j)}.

Furthermore, the segment category with the highest primary weight (max_j{W_(ij)}) may be determined as main segment category of the respective primary data D_(i).
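As a minimal sketch, selecting the main segment category amounts to taking the segment category whose primary weight is largest. The mapping below stands in for the output of the classification module 110; its values and the function name `main_segment` are illustrative assumptions.

```python
def main_segment(primary_weights):
    """Return the segment category S_j with the highest primary weight W_ij."""
    if not primary_weights:
        raise ValueError("no primary weights available")
    return max(primary_weights, key=primary_weights.get)

# Hypothetical primary weights {W_ij} for one primary content D_i.
weights = {"rock music": 0.72, "events & attractions": 0.21, "automotive": 0.07}
print(main_segment(weights))
```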

All this is typically achieved using a trained neural network, in particular a trained CNN. In so doing, the primary content D_(i) may be enriched by a data-set (metadata) assigning the content to several of the predefined segment categories {S_(j)} with specific weights {W_(ij)}.

The term “neural network” (NN) as used in this specification intends to describe an artificial neural network (ANN) including a plurality of connected units or nodes called artificial neurons. The output signal of an artificial neuron is calculated by a (non-linear) activation function of the sum of its input signal(s). The connections between the artificial neurons typically have respective weights (gain factors for the transferred output signal(s)) that are adjusted during one or more learning phases. Other parameters of the NN that may or may not be modified during learning may include parameters of the activation function of the artificial neurons such as a threshold. Typically, the artificial neurons are organized in layers. The most basic NN architecture, which is known as a “Multi-Layer Perceptron”, is a sequence of so-called fully connected layers. A layer consists of multiple distinct units (neurons) each computing a linear combination of the input followed by a nonlinear activation function. Different layers (of neurons) may perform different kinds of transformations on their respective inputs. Neural networks may be implemented in software, firmware, hardware, or any combination thereof. In the learning phase(s), a machine learning method, in particular a supervised, unsupervised or semi-supervised (deep) learning method may be used. For example, a deep learning technique, in particular a gradient descent technique such as backpropagation may be used for training of (feedforward) NNs having a layered architecture (deep neural networks). Modern computer hardware, e.g. GPUs, makes backpropagation efficient for many-layered neural networks. A convolutional neural network (CNN) is a feed-forward artificial neural network that includes, like most other NNs, an input (neural network) layer, an output (neural network) layer, and one or more hidden (neural network) layers arranged between the input layer and the output layer.
The speciality of CNNs is the usage of convolutional layers performing the mathematical operation of a convolution of the input with a kernel. The hidden layers of a CNN may include convolutional layers as well as optional pooling layers (for down sampling the output of a previous layer before inputting it to the next layer), fully connected layers and normalization layers. At least one of the hidden layers of a CNN is a convolutional neural network layer, in the following also referred to as convolutional layer. The usage of convolutional layer(s) can help to compute recurring features in the input more efficiently than fully connected layers. Accordingly, memory footprint may be reduced and performance improved. Due to the shared-weights architecture and translation invariance characteristics, CNNs are also known as shift invariant or space invariant artificial neural networks (SIANNs).
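To make the building blocks named above concrete, the following sketch chains a one-dimensional convolutional layer, a non-linear activation and a pooling layer in plain Python. It is a didactic toy, not the trained CNN of the embodiments: the kernel values are chosen by hand rather than learned, and, as is common in CNN implementations, the kernel is slid over the input without being flipped (strictly a cross-correlation).

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal ('valid' positions only, no flipping)."""
    n = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ]

def relu(values):
    """Non-linear activation: negative responses are set to zero."""
    return [max(0.0, v) for v in values]

def max_pool(values, size):
    """Down-sample by keeping the maximum of each non-overlapping window."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

signal = [0, 1, 2, 3, 2, 1, 0]
kernel = [-1, 0, 1]  # hand-picked kernel responding to rising values

features = conv1d(signal, kernel)   # [2, 2, 0, -2, -2]
pooled = max_pool(relu(features), 2)
print(features, pooled)
```

The same three kernel weights are reused at every position of the input, which is the weight sharing referred to above.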

Thereafter, the enriched primary content D_(i), W_(ij) may be processed by a (concept) recognition module 120, typically having a linguistic analysis module 121 and a concept enrichment module 122, to identify explicit concepts and implicit concepts {c_(ik)} in content of the primary data streams D_(i).

In particular, the linguistic analysis module may determine/further enrich the primary content with a respective normalized set of keywords by linguistic processing, i.e. keywords which are normalized to their base form in a linguistic sense. This typically includes normalisation to the corresponding lemma or canonical form, e.g. the lemma “run” for words like “runs”, “ran” or “running”. Normalisation also includes mapping of phrases including different word types like “running experience” to sentences like “With the new running shoe I experienced that jogging can actually be a lot of fun”.

The linguistic processing typically includes determining morphological variations of words in individual languages.

The linguistic processing is typically a pre-processing for the concept enrichment module 122. It increases the efficiency of the concept enrichment module 122, as it allows identifying implicit concepts which are within the primary data (text), but are used in a syntactically different way. For example, the concept “product development” can be identified from a phrase like “ . . . the ability to develop sustainable and ecological products”.
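The effect of such normalisation can be sketched with a toy lookup table of base forms. The table `BASE_FORMS`, the function names and the rule that a concept counts as present when all of its normalised words occur in the text are assumptions made for illustration; a production system would rely on a full morphological analyser for each language.

```python
import re

# Hypothetical, hand-made table mapping word forms to a shared base form.
BASE_FORMS = {
    "runs": "run", "ran": "run", "running": "run",
    "products": "product",
    "development": "develop", "developing": "develop",
}

def normalize(text):
    """Lower-case, split into words and map each word to its base form."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [BASE_FORMS.get(token, token) for token in tokens]

def mentions_concept(text, concept):
    """True if every normalised word of the concept occurs in the text."""
    return set(normalize(concept)) <= set(normalize(text))

phrase = "the ability to develop sustainable and ecological products"
print(normalize("runs ran running"))
print(mentions_concept(phrase, "product development"))
```

In this sketch the concept is recognised although neither of its words appears in the phrase in the same form or order.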

Typically, concepts embrace mental representations, abstract objects or abilities that make up the fundamental building blocks of thoughts and beliefs. Within this specification the term “concept” is used in a semantic sense and intends to describe an aggregation of terms and/or words including synonyms and abbreviations. In other words, a “concept” typically includes several semantically similar words and terms, respectively, which are used by humans to express any kind of thought, fact or cognition in natural language.

Within this specification the term “term” is intended to describe a word that has meaning (semantics) and most often refers to objects, ideas, events or a state of affairs.

Concepts can be either abstract terms like “virus” or “product development” or named entities referring to physical objects like persons (“Alfred Einstein”), organisations (“United Nations”), places (“Rome” or “Asia” or “Pacific Ocean”) or products (“iPhone” or “skim milk”) or any other kind and class of physical entities that can be referenced by their respective names.

Using concepts (aggregated terms and words) instead of terms and words substantially improves performance as the huge number of terms and words used in languages can be mapped to a reduced number of concepts.

The use of linguistic processing strongly increases the recall of assigning relevant concepts to content without requiring these implicit concepts to appear inside the content in exactly the same way as explicit concepts. Implicit concepts may be morphologically different and/or distributed within a sentence or paragraph as word(s)/terms/phrases describing the concept.

Thereafter, the resulting data may be further processed by the concept enrichment module 122. As illustrated in FIG. 1, the concept enrichment module 122 has access to the second database 60 (concept database), for example via an interface or storage module 150.

The concept database 60 is used to identify concepts c_(ik) in the primary content, regardless of whether they are explicitly mentioned. For example, the concept “virus (technology)” can be identified in content that uses words like “spyware” or “malware”, without the term “virus” being explicitly mentioned in the text. The enriched concepts can therefore be mentioned either explicitly or implicitly, as in the above example. The use of implicit concepts strongly increases the recall of the system in assigning relevant concepts to content without requiring the word describing the concept to explicitly appear in the text.
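A lookup of this kind may be sketched as follows. The structure of `CONCEPT_DB`, with one label word and a set of indicator terms per concept, is a hypothetical simplification of the concept database 60 used only to show the difference between explicit and implicit recognition.

```python
# Hypothetical excerpt of a concept database: the label word marks an
# explicit mention, the indicator terms mark an implicit one.
CONCEPT_DB = {
    "virus (technology)": {"label": "virus", "terms": {"spyware", "malware"}},
}

def recognize_concepts(tokens, concept_db=CONCEPT_DB):
    """Map each recognised concept to 'explicit' or 'implicit'."""
    token_set = set(tokens)
    found = {}
    for concept, entry in concept_db.items():
        if entry["label"] in token_set:
            found[concept] = "explicit"
        elif entry["terms"] & token_set:
            found[concept] = "implicit"
    return found

text = "new spyware found on millions of phones"
print(recognize_concepts(text.split()))
```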

Furthermore, the concept enrichment module 122 may assign an individual (first) secondary weight w_(ik) to each implicit or explicit concept c_(ik). The secondary weights w_(ik) are typically calculated based on the characteristic embeddings of the respective concept c_(ik) in the segment S_(j) the content is classified to by the classification module 110. This is explained in more detail below with regard to FIG. 3.

For example, consider a primary content D_(i) about a rock star and his latest concert. In a side note it is stated that the concert started half an hour later because of a traffic jam. The concept “traffic jam” will not carry a high (secondary) weight in the main segment category “rock music”. Indeed, the concept “traffic jam” is not relevant for the understanding of the main text of the primary content and is very likely of no particular interest for users who are interested in rock music and/or the particular rock star.

Thereafter, related concepts {r_(ikm)} and corresponding (second) secondary weights {w_(ikm)} characterizing embeddings of the related concepts {r_(ikm)} in at least one of the predefined segment categories S_(j) may be determined by a concept expansion module 130.

Typically, each concept c_(ik) is expanded/supplemented by a respective set of related concepts {r_(ikm)}, either within the same (main) segment category or any other segment category S_(j). Note that individual concepts are scored by weights based on their significance within that very segment.

Accordingly, events, situations, places or any other items in the physical world may be identified that are in the environment of, or related to, the identified concept c_(ik).

For example, the concept “virus” might be related to a concept like “hacking” or “dating platform” in the context and segment category, respectively, of “technology” or to a concept like “chicken” or “blood” in the segment categories “food”, “travel” and “health”, respectively.

Likewise, the concepts of the above mentioned example of primary content referring to the rock concert may be expanded by related concepts {r_(ikm)} like “drinking” or “alcohol” in the segment category “events & attractions”.
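As a rough illustration of segment-dependent concept expansion, the following sketch stores related concepts per (concept, segment category) pair. The store `RELATED`, the function `expand`, all weight values and the assignment of “chicken” to “food” and “blood” to “health” are assumptions made for the example only.

```python
# Hypothetical related-concept store keyed by (concept, segment category).
# All weights are illustrative values, not data from the disclosure.
RELATED = {
    ("virus", "technology"): {"hacking": 0.7, "dating platform": 0.4},
    ("virus", "food"): {"chicken": 0.5},
    ("virus", "health"): {"blood": 0.6},
}


def expand(concept, segments=None):
    """Return (related concept, segment, weight) triples, highest weight first.

    If `segments` is given, only relations within those segment
    categories are returned.
    """
    triples = [
        (related, segment, weight)
        for (name, segment), neighbours in RELATED.items()
        if name == concept and (segments is None or segment in segments)
        for related, weight in neighbours.items()
    ]
    return sorted(triples, key=lambda t: t[2], reverse=True)


print(expand("virus", {"technology"}))
```

Keying the store by segment category keeps the two meanings of “virus” apart, so an expansion restricted to “technology” does not return “blood”.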

Thereafter, a sentiment and/or emotion analysis module 140, which is typically based on another trained neural network, in particular another trained CNN, is used to determine sentiments s_(i) and/or emotions e_(i) in the primary content D_(i).

Emotions may be defined as classes such as anger, fear or love, which can be extracted from content. For example, the use of many “!!!” in combination with words like “finally” or “shit” can be assigned by a CNN to the emotion class “anger”. By this procedure the process can identify emotions that are related to certain concepts, events or situations in the text (e.g. the emotion “surprise” in a text about a political election) or that are direct utterances and emotions of the author of the content (e.g. in comments or user-generated content in social media).

For simplicity, the emotion analysis can also be restricted to a sentiment analysis, which scores the content as positive or negative on a continuous scale between +1 and −1.

As a result, the full set of data {i, W_(ij), c_(ik), w_(ik), r_(ikm), w_(ikm), s_(i), e_(i)}, including explicit and implicit concepts and sentiments together with their weighted assignment to segment categories and scored relationships, may be stored in the first database 70 (as a semantic knowledge graph).
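One possible in-memory layout for such a data set is sketched below. The class and field names are hypothetical, the example values for the rock concert content are invented for illustration, and the actual storage schema of the first database 70 is not specified in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RelatedConcept:
    name: str       # r_ikm
    segment: str    # segment category S_j in which the relation holds
    weight: float   # w_ikm


@dataclass
class Concept:
    name: str       # c_ik
    weight: float   # w_ik in the main segment category
    explicit: bool  # True if mentioned explicitly in the content
    related: List[RelatedConcept] = field(default_factory=list)


@dataclass
class ContentRecord:
    content_id: int                    # i
    primary_weights: Dict[str, float]  # W_ij keyed by segment category
    concepts: List[Concept]
    sentiment: Optional[float] = None  # s_i in [-1, +1], if determined
    emotion: Optional[str] = None      # e_i, if determined

    def main_segment(self) -> str:
        """Segment category with the highest primary weight."""
        return max(self.primary_weights, key=self.primary_weights.get)


record = ContentRecord(
    content_id=1,
    primary_weights={"rock music": 0.8, "events & attractions": 0.15},
    concepts=[
        Concept("concert", 0.9, True,
                [RelatedConcept("alcohol", "events & attractions", 0.3)]),
        Concept("traffic jam", 0.05, True),
    ],
    sentiment=0.4,
)
print(record.main_segment())
```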

The semantic knowledge graph typically provides a current representation (depending on the periodicity of the full process cycle) of what is relevant in different segment categories and how things are interrelated.

Typically, the semantic knowledge graph is updated or newly generated from time to time, for example periodically.

In effect, this allows representing time-dependent and trend-aware knowledge about the relevance and sentiment/emotion of concepts and their respective surroundings in different segment categories.

The semantic knowledge graph (SKG) may include more content/data than required for analysing target content of webpages and content of advertising campaigns, as it is typically built to represent as much knowledge as possible.

FIG. 2 illustrates in a block diagram a system 200 and processes for maintaining the second database 60 used for generating the SKG. The system and server 200, respectively, may be a sub-system of the systems 500, 500′ explained below with regard to FIGS. 5 and 6.

Similarly to what is explained above with regard to FIG. 1, respective content 10 may be obtained using a web-crawler and parsed, and URLs and the primary content D_(i) may be stored in the content index database 50 and/or retrieved from the database 50.

Typically, the system 200 is provided by and/or includes a server 200.

The primary content D_(i) may be fed via an interface or storage module 250 to a classification module 210 typically implementing the same or similar processes as the classification module 110 explained above with regard to FIG. 1.

In particular, primary weights {W_(ij)} may be determined by module 210 for the primary data streams D_(i). Each primary weight W_(ij) corresponds to a correlation between one of the primary data streams D_(i) and one segment category S_(j) of the predefined segment categories {S_(j)}.

Thereafter, a concept learning module 220 may be used to compute specific data for the existing concepts in all categories {S_(j)}, typically by calculating term and/or word embeddings for all the concepts {c_(ik)} which are specific to the respective segment S_(j) using a term embedding module 221.

In addition, a linguistic module 222 may compute the base forms of the different full forms in which concepts appear (e.g. the plural “viruses” and also “viral infection” are normalized to “virus”).

As a result, embedding terms {c_(jk)} and their weights {w_(jkl)} characterizing the embeddings of the embedding terms in the respective segment categories (S_(j)*) are determined by the (concept) learning module 220 and stored in the concept database 60. This is explained in some more detail below with regard to FIG. 3.

Note that the above described processes of building the SKG are facilitated by a fine-granular understanding of concepts, regardless of how exactly they are described by certain words within a text.

The term embedding module 221 of learning module 220 is typically implemented as a deep learning module based on a neural network which calculates the specific term embeddings of a term in a certain segment (see FIG. 1b). E.g. the embeddings (embedding terms) for the term (word) “virus” would be “inflammation”, “infection”, “HIV”, “bacteria” derived from content within the segment of “health”, or “ransomware”, “trojan”, “spyware” in the segment category of “internet”.

The term embedding module 221 is effective to interpret the context-specific meaning of a certain word (term) and in particular to disambiguate between the different meanings of homonyms.

The subsequently used linguistic analysis module 222 normalizes the names of identified concepts to a base form. This strongly increases the productivity in a process where the concepts are directly displayed, as the system automatically merges different variations of the names of the very same concept into one normalized base form. For example, the plural mentioning of “viruses” is always reduced to the singular form “virus”.

The normalized base form is stored in the concept database 60 together with its explicitly mentioned full forms. In other words, concepts are typically stored in their base form in the concept database 60.
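The normalization step can be sketched as a lookup from full forms to base forms. This is a deliberately simplified stand-in: the table `FULL_FORMS` and the function `to_base_form` are hypothetical, and a real linguistic analysis module would typically rely on morphological analysis rather than a fixed table.

```python
# Hypothetical table mapping explicitly mentioned full forms to the
# normalized base form of the concept.
FULL_FORMS = {
    "viruses": "virus",
    "viral infection": "virus",
}


def to_base_form(mention):
    """Normalize a mention to its base form, if a full form is known."""
    # Lower-case and collapse whitespace so that variants share one key.
    key = " ".join(mention.lower().split())
    return FULL_FORMS.get(key, key)


print(to_base_form("Viruses"))
print(to_base_form("viral  infection"))
```

Storing the full forms alongside the base form, as the text describes, allows both forms to be matched later while only the base form is displayed.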

FIG. 3 schematically illustrates how terms are embedded by other terms (embedding terms) with a certain weight and processes for determining the embedding terms (embeddings).

In the exemplary embodiment, six concepts c₁ to c₆ are shown including their segment categories S₁ to S₄. The term “virus” represents the exemplary concept c₅. The concept c₅ has different meanings in the segment categories S₃ (health) and S₄ (technology).

For example, the word “virus” could refer either to the cause of an infection of humans in the segment category S₃ or to an infection of a computer in the segment category S₄.

The different meanings may be determined from embedding terms. Embedding terms are words (terms) which are typically synonyms and/or frequently used in a similar way and with similar meaning.

In the exemplary embodiment, eight embedding terms t_(5i) like “inflammation”, “infection”, “HIV”, “bacteria” of the concept (term) c₅ “virus” are shown.

The term embedding module 221 is configured to determine how a term is embedded by other terms with a respective weight that reconstructs the linguistic contexts of the terms. The weights correspond to a certain probability of other words appearing in the same linguistic contexts. As a result, these terms usually are either synonyms, abbreviations or sub- or super-ordinate terms (specialisations or generalisations) of the respective term appearing in the primary content D.

These embeddings may be calculated individually from the content D_(i) within all the different segment categories S_(j).

Typically, a word embedding algorithm such as Google's word2vec algorithm is used for determining the weights characterizing the embeddings of the embedding terms t_(5i) with respect to (primary) terms taken as a basis for concepts.

In effect, this allows disambiguating the specific meaning of a word/term in different contexts (segment categories). Accordingly, well-defined concepts may be determined from words/terms.
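The idea of segment-specific embedding weights can be illustrated without a trained model. The sketch below is not word2vec: as a clearly simplified substitute it uses relative co-occurrence frequencies within a small window, computed separately per segment category. The function name `embedding_weights`, the window size and the sample sentences are assumptions.

```python
from collections import Counter


def embedding_weights(term, docs_by_segment, window=3):
    """Estimate, per segment, how likely other words occur near `term`.

    Returns {segment: {neighbour: relative frequency}}. This is a
    co-occurrence stand-in for a trained word embedding model.
    """
    result = {}
    for segment, docs in docs_by_segment.items():
        counts = Counter()
        for doc in docs:
            tokens = doc.lower().split()
            for pos, tok in enumerate(tokens):
                if tok != term:
                    continue
                lo, hi = max(0, pos - window), pos + window + 1
                counts.update(t for t in tokens[lo:hi] if t != term)
        total = sum(counts.values())
        result[segment] = (
            {t: n / total for t, n in counts.items()} if total else {}
        )
    return result


docs = {
    "health": ["virus infection spreads", "bacteria and virus infection"],
    "technology": ["spyware virus detected"],
}
weights = embedding_weights("virus", docs)
print(sorted(weights["technology"]))
```

Because the weights are computed per segment category, “infection” only appears as a neighbour of “virus” in “health” and “spyware” only in “technology”, which is the disambiguation effect described above.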

According to an embodiment of a computer-implemented method for determining concepts, the method includes determining primary weights ({W_(ij)}) for a plurality of primary content, each of which is typically derived from a respective primary data (stream). Each primary weight refers to a correlation between one primary content and one segment category (S_(j)) of several predefined segment categories. Primary terms are identified in each primary content. The primary terms are typically normalized to their respective base form. Embedding terms and their weights with respect to the primary terms within the predefined segment categories are determined. Concepts are determined. Each concept is based on one of the primary terms, the embedding terms for the primary term and the respective weights. Typically, the concepts are stored in a concept database.

FIG. 4 illustrates in a block diagram a system 300 and processes for creating and/or maintaining a campaign database 80.

Typically, the system 300 is provided by and/or includes a server 300. In the following, the system 300 is also referred to as campaign analyzing system. The system and server 300, respectively, may be a sub-system of the systems 500, 500′ explained below with regard to FIGS. 5 and 6.

Advertising campaigns Ad_(j) are sent to the server 300 and registered in a campaign management module 310, for example from a server 20 via the network and/or an interface or storage module 350. In the following, the campaign management module 310 is also referred to as management module.

The management module typically requests additional information {AD_(j)} regarding the advertised brand, service or product from the network, such as product descriptions 30, user comments and the like.

The content of the respective campaign Ad_(j) and the received content of the respective additional information (also referred to as additional content) {AD_(j)}, e.g. describing the product (like the landing page of the campaign), may be sent to a semantic analysis module 320.

The semantic analysis module 320 may be used to identify relevant concepts c_(Aji) for the ads Ad_(j), which are typically retrieved from the concept database 60 storing weights {w_(ikl)} for concepts {c_(ik)} within predefined segment categories {S₁}.

The semantic campaign analysis module 320 typically analyses the content of the ads Ad_(j) and enriches it with semantic meta-data.

Further, the semantic campaign analysis module 320 may be effectively identical to the linguistics analysis and concept enrichment module 120 explained above with regard to FIG. 1. That means that it is also configured to identify the explicit and implicit concepts of the ads Ad_(j).

For example, for an ad about “espresso” the semantic campaign analysis module 320 may assign concepts like “espresso” but also the implicitly found concept “coffee” as retrieved from the concept database 60.

Thereafter, a semantic expansion module 330, which retrieves its data from the first database 70, may be used for adding surrounding topics.

The semantic expansion module 330 typically analyses content of the ads Ad_(j) and expands it with other concepts as stored in and retrieved from the first database 70.

In the above example of the ad referring to “espresso”, related concepts like “breakfast” (in the segment category of “food”) or “Italy” (in the segment category of “travel”) may be determined.

More specifically, the semantic analysis module 320 may determine primary weighted semantic metadata {c_(Aji), w_(ji)} for the advertising campaigns Ad_(j) using the second database 60 storing weights {w_(ikl)} for concepts c_(ik) within predefined segment categories S₁; and the semantic expansion module 330 may determine secondary weighted semantic metadata {c_(Aji′), w_(ji′)} for the advertising campaigns Ad_(j) (and referring to surrounding topics) using the first database 70 which stores first concepts {c_(ik)}, first weights {w_(ik)} for the first concepts in predefined segment categories ({S_(j)}), second concepts ({r_(ikm)}) that are related to the first concepts, and the second weights ({w_(ikm)}) for the second concepts (related concepts) in predefined segment categories ({S_(j)}), and ratings {s_(i), e_(i)} of the advertising campaigns Ad_(j).

The thus expanded concepts (weighted metadata) are stored in the campaign database 80, typically together with the primary campaign data Ad_(j).

Optionally, the concepts and weighted metadata in the campaign database for a given campaign Ad_(j) can be edited and modified by a campaign manager. This may include:

Manually adding other concepts or expanded concepts, e.g. “tired” and “caffeine” (in the context of “personal health”), and/or

Changing the automatically computed sentiment s_(i) of a certain related concept for the given campaign (e.g. setting the sentiment of the concept “plane”, which has a trend-aware negative sentiment because of a recent plane crash, to a positive sentiment, because the campaign is from a railway company that competes with air travel).

FIG. 5 is a block diagram illustrating a system 500 and processes for facilitating online advertising in a network with improved privacy.

A client device 1 may retrieve, on request of a user, a URL in the network. For example, the user may navigate in the WWW and retrieve the URL using a browser displayed on a display of the client device 1. This may be detected by an Ad server 400. The Ad server 400 may declare the webpage referenced by this URL as a target content page 10′, which is subsequently forwarded to a matching server 501.

A (campaign) management module 510 of the server 501 may compare the URL of the target content page 10′ (target URL) with all URLs stored in the content index database 50.

If the target URL is not yet stored in the index, the URL may be processed by a server 100 as explained above with regard to FIG. 1 and/or a server 300′ operating similarly to the server 300 as explained above with regard to FIG. 4, but analysing content of target pages (target content) D′_(i) instead of ads and storing the determined weighted metadata for the target content in a target database 90.

Note that the semantic expansion processes of determining the second concepts that are related to the first concepts, as explained above with regard to the module 330 (see e.g. paragraph [00116]), are typically omitted when analysing the target content D′_(i). Alternatively, these semantic expansion processes may be omitted when analysing the ads Ad_(j). However, the latter may be less performant. Furthermore, the content of the target pages is typically only analysed by the server 300′ on request to save data space. In other words, the target database 90 may be omitted.

On request of the campaign management module 510, a matching module 520 determines for the target content D′_(i) a matching parameter p_(ij) with advertising campaigns Ad_(j) using weighted semantic target metadata for the target content 10′ provided by the target database 90 or the server 300′, and weighted semantic campaign metadata for the advertising campaigns Ad_(j) provided by the campaign database 80.

The weighted semantic target metadata for the target content 10′ may include weights w′_(im) for the concepts c_(im)′ of the target content 10′, and the weighted semantic campaign metadata may include weights w_(jk) for the concepts c_(jk) of the advertising campaigns Ad_(j).

The pairwise matching parameter p_(ij) is typically determined as a function of the weights w′_(im) and w_(jk) of common concepts c_(im)′=c_(jk), for example a function of the products w′_(im)*w_(jk) of the common concepts c_(im)′=c_(jk).

In one embodiment, the pairwise matching parameter p_(ij) is determined as the sum of the products w′_(im)*w_(jk) of the common concepts c_(im)′=c_(jk):

p_(ij) = Sum(w_(jk) * w′_(im), ∀ c_(im)′ = c_(jk))   (1)

A higher matching parameter p_(ij) indicates a better concept match between the target content 10′ and the ads Ad_(j). Accordingly, the matching parameter p_(ij) may be used for ranking the ads Ad_(j) with respect to the target content 10′.
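A minimal sketch of eq. 1 follows, assuming that the weighted metadata of the target content and of the campaign are each available as a mapping from concept name to weight. The function name `matching_parameter` and the weight values are hypothetical.

```python
def matching_parameter(target, campaign):
    """Sum of weight products over concepts common to both inputs (eq. 1).

    `target` maps concepts c'_im to weights w'_im, `campaign` maps
    concepts c_jk to weights w_jk.
    """
    common = target.keys() & campaign.keys()
    return sum(target[c] * campaign[c] for c in common)


# Illustrative weights only: an article about Rome and an espresso ad
# that was expanded with the related concept "Italy".
article = {"Rome": 0.9, "Italy": 0.6, "traffic": 0.1}
espresso_ad = {"espresso": 1.0, "Italy": 0.5}
print(matching_parameter(article, espresso_ad))
```

Only “Italy” is common to both mappings, so the result is the single product 0.6 * 0.5. Concepts that occur on only one side do not contribute.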

Typically, the matching module 520 also provides any sentiments s and/or emotions e stored for the advertising campaigns Ad_(j) in the campaign database 80 and/or any sentiments s and/or emotions e of the target content 10′ to the campaign management module 510.

The sentiments s and/or emotions e may also be used for ranking the ads Ad_(j).

Typically, emotions e are measured on a scale with different values, while sentiments take binary values such as +1 and −1, representing positive and negative emotions, respectively.

Typically, the matching parameter p_(ij) is increased if the emotions or sentiments of the target content 10′ and the advertising campaigns Ad_(j) are both positive, if only one positive emotion or sentiment is found, but also if the emotions or sentiments of the target content 10′ and the advertising campaigns Ad_(j) are both negative. Likewise, the matching parameter p_(ij) is typically decreased if only one negative emotion or sentiment is found. This may however be overruled (see the example for negative sentiment below). If no emotions and sentiments are found, the matching parameter p_(ij) may remain unchanged.

For example, the matching parameter p_(ij) may be multiplied by the product of the found sentiment values s_(T) of the target content and s_(A) of the target ad Ad_(j):

p_(ij) = s′_(T) * s′_(A) * Sum(w_(jk) * w′_(im), ∀ c_(im)′ = c_(jk))   (2)

In eq. 2, s′_(T) and s′_(A) are equal to 1 if the respective sentiment s_(T) or s_(A) is not found, and equal to the respective value of s_(T) or s_(A) otherwise.
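The sentiment adjustment of eq. 2 can be sketched as follows, taking an already computed sum of weight products as input and representing a sentiment that was not found as `None`. The function name `adjusted_matching_parameter` is hypothetical.

```python
from typing import Optional


def adjusted_matching_parameter(
    p: float, s_target: Optional[float], s_ad: Optional[float]
) -> float:
    """Scale the concept match `p` by the sentiment factors of eq. 2.

    A sentiment that was not found (None) contributes a factor of 1,
    so the matching parameter is left unchanged by it.
    """
    s_t = 1.0 if s_target is None else s_target
    s_a = 1.0 if s_ad is None else s_ad
    return s_t * s_a * p


print(adjusted_matching_parameter(0.3, None, None))   # no sentiment found
print(adjusted_matching_parameter(0.3, -1.0, -1.0))   # both negative
print(adjusted_matching_parameter(0.3, -1.0, None))   # only one negative
```

With binary sentiments, two negative values cancel and keep the match positive, while a single negative value flips its sign, which corresponds to the behaviour described for the both-negative and the only-one-negative cases.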

As a result, a ranked or scored list 40 ({Ad_(j)*}) of the ads Ad_(j) may be provided.

The Ad server 400 may then deliver the ad with the highest rank or score (or another one depending on other rulesets defined in the ad server) to the client.

In particular, the IDs of the ads with the highest matching scores are sent to an ad server. The ad server delivers the target ad with the highest matching score (or another one, if other rules for ad targeting are applied and overrule this matching rule) to the target content page that the user is actually consuming.

Alternatively, for campaign planning, all URLs with the highest scores are sent to an ad tech system so that the ads are preferably placed on those URLs (white list), and all URLs with high negative scores are sent so that the ads are preferably not placed on these URLs (blacklist).
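For the campaign planning case, the split into white list and blacklist can be sketched with two thresholds. The thresholds `allow_above` and `block_below`, the function name `split_urls`, the placeholder URLs and the score values are assumptions; the text does not specify how “highest” and “high negative” scores are delimited.

```python
def split_urls(scores, allow_above, block_below):
    """Split scored URLs into a white list and a blacklist.

    `scores` maps a URL to its matching score for one campaign.
    URLs between the two thresholds end up on neither list.
    """
    whitelist = sorted(
        (url for url, s in scores.items() if s >= allow_above),
        key=scores.get,
        reverse=True,
    )
    blacklist = sorted(
        (url for url, s in scores.items() if s <= block_below),
        key=scores.get,
    )
    return whitelist, blacklist


scores = {
    "https://example.com/rome": 0.8,
    "https://example.com/news": 0.1,
    "https://example.com/crash": -0.6,
}
print(split_urls(scores, allow_above=0.5, block_below=-0.5))
```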

In the following, examples for using sentiments and/or emotions are given.

In one example, an advertising campaign of a railway company with an enriched emotion of sadness in relation to the concept of “plane” or “air travel” is considered to be positive with respect to a target URL with the enriched concept “plane”.

The matching function may support two different purposes: the positive correlation between content and ad (for best fitting the message of the commercial ad to the consumed content) and the negative correlation (for avoiding showing messages within a content context that puts the brand into a negative or compromising context).

Example for positive fitting:

The matching function identifies the expanded concept “Italy” assigned to an ad about “espresso” by the semantic expansion module and calculates a positive match with an article that has a high weight for the explicit concept “Rome” and the implicit concept “Italy”.

The cognitive matching module calculates a continuous (relevance) score of the target URL for every ad registered as active in the campaign management module.

To avoid bad placements of ads, placements which have a negative score are to be avoided.

Example for avoiding negatively connotated placements:

The matching function identifies the concept “alcohol” assigned to an ad about “whiskey” by the semantic expansion module and calculates a negative match with an article that has a high weight for the explicit concepts “alcohol” and “health” but has a strong negative sentiment, because it describes the bad consequences of alcohol for personal health.

However, there are cases where a negative sentiment of the target content may not lead to exclusion, depending on the campaign.

Example for deliberately placing an ad within a negative sentiment:

The target content is about a mass pileup on a highway and the ad refers to a “Relaxed travel” campaign of a railway company. While the target content would be rated negative for an ad of a “car manufacturer”, it could consciously be evaluated as a positive match for the railway company.

FIG. 6 is a block diagram illustrating a system 500′ for facilitating online advertising in a network with improved privacy. The system 500′ is typically similar to the system 500 explained above with regard to FIG. 5. In particular, the system 500′ typically includes as a subsystem a server 501 as explained above with regard to FIG. 5 for performing the matching in block 5.

Furthermore, the system 500′ typically includes as respective subsystems a server 100 as explained above with regard to FIG. 1 for generating (and maintaining) the semantic knowledge graph and the first database 70, respectively, as indicated by block 1, a server 200 for generating (and maintaining) the concept database 60 as indicated by block 2, and a server 300 for analysing the target content (target analyzing system) and the ads (campaign analyzing system) as indicated by block 3.

In particular, the system 500′ may include:

a matching module configured to use weighted semantic target metadata for a target content (10′) provided in the network and weighted semantic campaign metadata for an advertising campaign (Ad_(j)) to be presented in the network and referring to a respective product and/or a service to determine a matching parameter (p_(ij)) between the target content and the advertising campaign; and

a management module (510) configured to use the matching parameter (p_(ij)) to decide if the advertising campaign (Ad_(j)) is to be provided to the target content (10′).

Typically, the weighted semantic target metadata for the target content comprise weights for the concepts of the target content.

The weighted semantic campaign metadata may comprise weights for the concepts of the advertising campaign (Ad_(j)).

The matching module (520) is typically configured to determine the matching parameter p_(ij) as a function of the weights for the concepts of the target content and the weights for the concepts of the advertising campaign.

Typically, the function depends on the products of the weights of common concepts of the target content and the advertising campaign (see eq. 1 above).

The system 500′ typically includes a campaign database (80) storing semantic metadata for advertising campaigns (ads), and the matching module (520) has access to the campaign database (80).

Alternatively or in addition, the matching module may have access to a campaign analyzing system (300) which is configured to determine semantic metadata ({c_(Ai), w_(ik) . . . }) for advertising campaigns and/or to store the semantic metadata ({c_(Ai), w_(ik) . . . }) in a campaign database (80).

The campaign analyzing system typically includes:

a semantic analysis module (320) that is, when executed by the at least one of the one or more processors, configured to determine primary weighted semantic metadata for the advertising campaigns (Ad_(j)) using the concept database (60) typically storing weights ({w_(ikl)}) for concepts (c_(ik)) within predefined segment categories (S₁); and

a semantic expansion module (330) configured to determine secondary weighted semantic metadata for the advertising campaigns (Ad_(j)) using the first database (70) typically storing first concepts {c_(ik)}, first weights {w_(ik)} for the first concepts in predefined segment categories {S_(j)}, second concepts {r_(ikm)} that are related to the first concepts, and second weights ({w_(ikm)}) for the second concepts in the predefined segment categories {S_(j)}, and optionally identified ratings of the advertising campaigns (Ad_(j)).

The semantic campaign analysis module (320) and/or the semantic expansion module (330) may be configured to use additional content {AD_(j)} retrieved from the network for determining the respective weighted semantic metadata.

Further, the system 500′ may include a target database (90) storing semantic metadata for target content (10, 10′).

Alternatively or in addition, the system 500′ may include a targetanalyzing system including:

a semantic analysis module configured to determine primary weighted semantic metadata for target content using the concept database (60);

a semantic expansion module configured to determine secondary weighted semantic metadata for the target content using the first database (70) storing first concepts ({c_(ik)}), first weights ({w_(ik)}) for the first concepts in predefined segment categories ({S_(j)}), and optionally identified ratings ({s_(i)}) of the target content; and

a storage module (350) configured to store the respective weighted semantic metadata and optionally the identified ratings ({s_(i)}) of the target content in a target database (90).

The system 500′ may include one or more servers (100, 200, 300, 400, 501) connected to the network and configured to initiate sending via the network an advertising campaign with a matching parameter {p_(ij)} above a predefined threshold to a client (1) having a display displaying the target content (10′) and the advertising campaign.

The system 500′ may be configured to initiate sending the advertising campaign at least substantially based on the matching parameter {p_(ij)}, or at least substantially based on the matching parameter {p_(ij)} and identified ratings ({s_(i)}) of the target content and/or the advertising campaign. Alternatively or in addition, the system may be configured to initiate sending the advertising campaign without taking into account tracking data of registered users of the client (1).

In the exemplary embodiment of FIG. 6, the server 501 may additionally be configured to perform additional blocks or modules 4 referring to a campaign qualification and 6 referring to an impact analysis.

In a nutshell, the campaign qualification and the impact analysis may be implemented as variants of the matching explained above. The campaign qualification takes place in a campaign planning process. It makes a prediction over the whole addressable target content and makes recommendations where ads should be placed and where not. In combination with additional data about the number of users visiting the respective URLs, the system can deliver a prediction of how many qualified placements can be made. This information can also be utilized in programmatic ad bidding processes, to optimize the biddings to focus on those URLs which will have the best impact and less risk of bad placements and still reach the requested number of users. The impact analysis takes place in a continuous process during and after ad placement. It evaluates the actual click behaviour of the user (without tracking personal data) and checks whether the user actually clicks on a recommended ad. From this it can then be concluded how well the advertising message is received by the user, especially in connection with the concepts and emotions. The information can be used for adapting the weights and/or the matching functions. In addition, this information is of interest to the advertiser in that it makes explicit in which emotions and topical surroundings users are open to a brand message.

FIG. 7A illustrates a flow chart of a method 1000 for creating, updating and/or maintaining the first database 70. Method 1000 may be used by the server 100 explained above with regard to FIG. 1.

After receiving a primary content D_(i), primary weights {W_(ij)} referring to a respective correlation between the primary content D_(i) and one segment category of several predefined segment categories S_(j) are determined in a block 1100.

Further, a sentiment and/or an emotion {s_(i), e_(i)} may be determined for the primary content D_(i) in a block 1400.

In a subsequent block 1200, explicit concepts and implicit concepts {c_(ik)} may be identified in the primary content (D_(i)), and first secondary weights ({w_(ik)}) characterizing embeddings of the identified concepts ({c_(ik)}) in the respective main segment category (S_(j)) with highest primary weight (Max{w_(ij)}, j) may be determined using a concept database (60) storing weights ({w_(ikl)}) for concepts (c_(ik)) within the predefined segment categories (S₁).

Thereafter, respective related concepts ({r_(ikm)}) and second secondary weights ({w_(ikm)}) of the related concepts ({r_(ikm)}) characterizing embeddings of the related concepts ({r_(ikm)}) in predefined segment categories (S_(j)) may be determined for the identified concepts ({c_(ik)}).

The identified concepts ({c_(ik)}), the first secondary weights ({w_(ik)}), the related concepts ({r_(ikm)}), the second secondary weights ({w_(ikm)}), and any sentiment and/or emotion may be stored in the first database (70) in a block 1500.

As indicated by the dashed arrow in FIG. 7A, the index i may be incremented and a further primary content D_(i) may be processed.

FIG. 7B illustrates a flow chart of a method 3000 for facilitating online advertising with improved privacy in a network. Method 3000 may be used by the server 501 explained above with regard to FIG. 5.

In blocks 3100, 3200, respective weighted semantic metadata are provided for an advertising campaign Ad_(j) and a target content D′_(i).

The weighted semantic metadata include respective concepts c_(jk), c′_(im) and corresponding weights (typically real numbers) w_(jk), w′_(im) of the concepts c_(jk), c′_(im).

In a block 3400, the weighted semantic target metadata {c′_(im), w′_(im) . . . } of the target content D′_(i) and the weighted semantic campaign metadata {c_(jk), w_(jk) . . . } of the advertising campaign Ad_(j) are used to calculate a matching parameter p_(ij) between the target content D′_(i) and the advertising campaign Ad_(j) as explained above with regard to FIG. 5.

As indicated by the dashed arrow in FIG. 7B, the index j may be incremented and a further matching parameter p_(ij) between the target content D′_(i) and a further advertising campaign Ad_(j) may be calculated.

In a block 3500, the matching parameters p_(ij) are used for ranking the advertising campaigns Ad_(j) with respect to the target content D′_(i).

Finally, the highest ranked advertising campaign(s) Ad_(j) may be provided to a client displaying the target content D′_(i) in a block 3600.
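The loop over campaigns and the subsequent ranking can be sketched as below. The sketch assumes the sum-of-products matching of eq. 1 and represents the weighted metadata as mappings from concept to weight; the function name `rank_campaigns`, the campaign identifiers and the weights are hypothetical.

```python
def rank_campaigns(target, campaigns):
    """Score every campaign against one target content and rank them.

    `target` maps concepts of the target content to weights;
    `campaigns` maps a campaign id to its concept-to-weight mapping.
    Returns (campaign id, matching parameter) pairs, best match first.
    """
    scored = []
    for campaign_id, metadata in campaigns.items():
        common = target.keys() & metadata.keys()
        p = sum(target[c] * metadata[c] for c in common)
        scored.append((campaign_id, p))
    return sorted(scored, key=lambda item: item[1], reverse=True)


target = {"Rome": 0.9, "Italy": 0.6}
campaigns = {
    "whiskey": {"whiskey": 1.0, "alcohol": 0.8},
    "espresso": {"espresso": 1.0, "Italy": 0.5},
}
ranked = rank_campaigns(target, campaigns)
print(ranked[0][0])  # id of the highest ranked campaign
```

Note that only concept weights enter this ranking; no user identifier or tracking data is part of the inputs.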

According to an embodiment of a system for creating and/or maintaining a database, the system includes a classification module configured to determine primary weights for primary data (streams), each primary weight referring to a correlation between a content of one of the primary data and one segment category of several predefined segment categories, and to determine a segment category with highest primary weight as a main segment category; a recognition module configured to identify explicit concepts and implicit concepts in the content of the primary data streams, and to determine first secondary weights characterizing embeddings of the identified explicit concepts and implicit concepts in the main segment category using a concept database storing concepts and respective concept metadata including weights of the concepts within the predefined segment categories and/or relations between concepts within the predefined segment categories; and an expansion module configured to determine for the identified (explicit and implicit) concepts respective related (explicit and implicit) concepts and second secondary weights of the related concepts, the second secondary weights characterizing embeddings of the related concepts in at least one of the predefined segment categories.

The system typically includes an analysis module configured to identify in the primary data streams a rating comprising at least one of a sentiment and an emotion.

Further, the system typically includes a storage module configured to save the identified concepts, the first secondary weights, the related concepts, the second secondary weights, and optionally the identified rating in the first database.

According to an embodiment of a system for facilitating online advertising with improved privacy in a network, the system includes a matching module configured to use weighted semantic target metadata for a target content provided in the network and weighted semantic campaign metadata for an advertising campaign to be presented in the network and referring to a respective product and/or a service to determine a matching parameter between the target content and the advertising campaign, and a management module configured to use the matching parameter for ranking of advertising campaigns with respect to the target content and/or deciding if the advertising campaign is to be provided to the target content.

According to an embodiment of a computer-implemented method for facilitating online advertising with improved privacy in a network, the method includes semantically analyzing an advertising campaign to be presented in the network and referring to a respective product and/or a service to determine for the advertising campaign weighted semantic campaign metadata including a concept of the advertising campaign and a respective weight of the concept in at least one segment category of several predefined segment categories; semantically analyzing a target content, in particular a webpage, provided in the network to determine weighted semantic target metadata comprising a concept of the target content and a respective weight of the concept of the target content in at least one segment category of the several predefined segment categories; using the weighted semantic target metadata and the weighted semantic campaign metadata to determine a matching parameter between the target content and the advertising campaign; and using the matching parameter to decide if the target content is to be linked with the advertising campaign and/or if the advertising campaign is to be provided when the target content is retrieved on request of a client connected to the network.

Typically, at least one of the concept of the advertising campaign and the concept of the target content is an implicit concept.

The systems, devices and methods explained herein do not require recording, storing and processing of any kind of personal data and are therefore compliant by design with privacy requirements such as the principles of the GDPR and the like as well as the forthcoming e-Privacy regulation. In particular, user tracking can be completely avoided.

Instead, various techniques of natural language processing including CNNs and/or deep neural networks are used to predict those ads that will have the most positive impact on a user's willingness to click on the ad, depending on the context the ad is embedded in.

A context-sensitive algorithm as explained herein may be used to subtly map out the different meanings of individual words in various contexts (like for instance “attack” in the context of sports or warfare) and the related implications on the user's perception.
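How the context of a word such as “attack” might be resolved can be illustrated by the following minimal sketch. It assumes that, for each segment category, weighted embedding terms of the word are available, and it scores each category by the summed weights of those embedding terms that occur in the surrounding text. The scoring rule, the table of embedding terms and the function names are hypothetical and merely stand in for the trained models mentioned above.

```python
from typing import Dict, Iterable, Optional

# Hypothetical embedding terms of a word per segment category, with weights.
EMBEDDING_TERMS: Dict[str, Dict[str, Dict[str, float]]] = {
    "attack": {
        "sports": {"striker": 0.8, "goal": 0.7, "defense": 0.6},
        "warfare": {"troops": 0.9, "missile": 0.8, "defense": 0.5},
    }
}


def context_scores(term: str, context_words: Iterable[str]) -> Dict[str, float]:
    """Score each category by the weights of embedding terms found in context."""
    words = {w.lower() for w in context_words}
    return {
        category: sum(weight for emb, weight in embeddings.items() if emb in words)
        for category, embeddings in EMBEDDING_TERMS.get(term, {}).items()
    }


def most_likely_context(term: str, context_words: Iterable[str]) -> Optional[str]:
    """Return the best-scoring category, or None for an unknown term."""
    scores = context_scores(term, context_words)
    return max(scores, key=scores.get) if scores else None


print(most_likely_context("attack", "The striker led the attack and scored a goal".split()))
print(most_likely_context("attack", "Troops launched a missile attack".split()))
```

With the sample table, the first sentence resolves to the sports context and the second to the warfare context.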

Further, the algorithm may be sensitive to the changing relevance of topics and the related words people assign to them and/or to the changing emotional implications and moods that certain topics and words may cause in the user consuming content and an embedded ad, depending on time and context. Note that sentiments are typically not static but may depend on context, time and/or segment category (field).

In particular, a distributed computing architecture that analyses the different realms of information may be used. The first realm includes all content assets, like webpages, that are possible targets in which an ad may be embedded. The second realm includes all information regarding the campaign, including the ad description, the landing page of the campaign and optionally additional information related to the campaign or the advertised offering or company. The third realm includes all kinds of publicly available information in the WWW like news, online magazines, and blogs as well as social media and user forums.

The explained processes for facilitating online advertising with improved privacy, as well as for creating and/or maintaining databases, can practically not be performed by humans (i.e. in the human mind) but may outperform humans with respect to the quality of ad placement.

Although various exemplary embodiments of the invention have been disclosed, it will be apparent to those skilled in the art that various changes and modifications can be made which will achieve some of the advantages of the invention without departing from the spirit and scope of the invention. It will be obvious to those reasonably skilled in the art that other components performing the same functions may be suitably substituted. It should be mentioned that features explained with reference to a specific figure may be combined with features of other figures, even in those cases in which this has not explicitly been mentioned. Such modifications to the inventive concept are intended to be covered by the appended claims.

While processes may be depicted in the figures in a particular order, this should not be understood as requiring, if not stated otherwise, that such operations have to be performed in the particular order shown or in sequential order to achieve the desirable results. In certain circumstances, multitasking and/or parallel processing may be advantageous.

Spatially relative terms such as “under”, “below”, “lower”, “over”, “upper” and the like are used for ease of description to explain the positioning of one element relative to a second element. These terms are intended to encompass different orientations of the device in addition to different orientations than those depicted in the figures. Further, terms such as “first”, “second”, and the like, are also used to describe various elements, regions, sections, etc. and are also not intended to be limiting. Like terms refer to like elements throughout the description.

As used herein, the terms “having”, “containing”, “including”, “comprising” and the like are open ended terms that indicate the presence of stated elements or features, but do not preclude additional elements or features. The articles “a”, “an” and “the” are intended to include the plural as well as the singular, unless the context clearly indicates otherwise.

With the above range of variations and applications in mind, it should be understood that the present invention is not limited by the foregoing description, nor is it limited by the accompanying drawings. Instead, the present invention is limited only by the following claims and their legal equivalents.

1. A system for creating and/or maintaining a first database, the system comprising: one or more processors; a classification module that is, when executed by at least one of the one or more processors, configured to determine primary weights for primary data streams, each primary weight referring to a correlation between one of the primary data streams and one segment category of several predefined segment categories; a recognition module that is, when executed by at least one of the one or more processors, configured to identify explicit concepts and implicit concepts in the primary data streams, and to determine first secondary weights characterizing embeddings of the identified concepts in the respective main segment category with highest primary weight using a concept database storing weights for concepts within the predefined segment categories; an expansion module that is, when executed by at least one of the one or more processors, configured to determine for the identified concepts respective related concepts and second secondary weights of the related concepts characterizing embeddings of the related concepts in at least one of the predefined segment categories; and a storage module that is, when executed by at least one of the one or more processors, configured to save the identified concepts, the first secondary weights, the related concepts, and the second secondary weights in the first database.
 2. The system of claim 1, further comprising an analysis module that is, when executed by at least one of the one or more processors, configured to identify in the primary data streams a rating comprising at least one of a sentiment, and an emotion, wherein the storage module is configured to save the identified rating in the first database.
 3. The system of claim 1, wherein the recognition module comprises at least one of: a linguistic analysis module that is, when executed by at least one of the one or more processors, configured to determine for the primary data streams a respective normalized set of keywords; and a concept enrichment module that is, when executed by at least one of the one or more processors, configured to identify the explicit concepts, the implicit concepts, and the first and second secondary weights using the concept database.
 4. The system of claim 1, wherein the system is configured to host the first database and/or wherein the storage module is configured to save the identified concepts, the first secondary weights, the related concepts, the second secondary weights, and/or the identified rating in a semantic knowledge graph structure of the first database.
 5. The system of claim 1, wherein the classification module comprises a trained CNN.
 6. A system for maintaining a second database, the system comprising: one or more processors; a classification module that is, when executed by at least one of the one or more processors, configured to determine primary weights for primary data streams, each primary weight referring to a correlation between one of the primary data streams and one segment category of several predefined segment categories; a learning module that is, when executed by at least one of the one or more processors, configured to determine for known concepts comprising a respective term found in the primary data streams embedding terms for the respective term and weights characterizing the embeddings of the embedding terms in the respective segment categories; and a storage module configured to update the known concepts stored in the second database in accordance with the embedding terms and the weights.
 7. The system of claim 6, wherein the learning module comprises a deep learning module which is based on a neural network.
 8. The system of claim 6, wherein the concept learning module implements a word embedding algorithm for determining the weights characterizing the embeddings of the embedding terms.
 9. The system of claim 6, wherein the concept learning module comprises at least one of: an embedding module that is, when executed by at least one of the one or more processors, configured to determine the weights characterizing the embeddings of the embedding terms; and a linguistic analysis module that is, when executed by at least one of the one or more processors, configured to normalize the names of concepts and/or terms to a respective base form.
 10. A system for facilitating online advertising with improved privacy in a network, the system comprising: one or more processors; a matching module that is, when executed by at least one of the one or more processors, configured to use weighted semantic target metadata for a target content provided in the network and weighted semantic campaign metadata for an advertising campaign to be presented in the network and referring to a respective product and/or a service to determine a matching parameter between the target content and the advertising campaign; and a management module that is, when executed by at least one of the one or more processors, configured to use the matching parameter to decide if the advertising campaign is to be provided to the target content.
 11. The system of claim 10, wherein the weighted semantic target metadata for the target content comprise weights for the concepts of the target content, and wherein the weighted semantic campaign metadata comprise weights for the concepts of the advertising campaign, and, wherein the matching module is, when executed by the at least one of the one or more processors, configured to determine the matching parameter as a function of the weights for the concepts of the target content and the weights for the concepts of the advertising campaign.
 12. The system of claim 11, wherein the function depends on the products of the weights of common concepts of the target content and the advertising campaign.
 13. The system of claim 10, wherein the system comprises a campaign database storing semantic metadata for advertising campaigns, and wherein the matching module has, when executed by the at least one of the one or more processors, access to the campaign database.
 14. The system of claim 10, wherein the matching module has, when executed by the at least one of the one or more processors, access to a campaign analyzing system which is configured to determine semantic metadata for advertising campaigns and/or to store the semantic metadata in a campaign database.
 15. The system of claim 14, wherein the campaign analyzing system comprises: one or more processors; a semantic analysis module that is, when executed by the at least one of the one or more processors, configured to determine primary weighted semantic metadata for the advertising campaigns using a database storing weights for concepts within predefined segment categories; and a semantic expansion module, that is, when executed by the at least one of the one or more processors, configured to determine secondary weighted semantic metadata for the advertising campaigns using a database storing first concepts, first weights for the first concepts in predefined segment categories, second concepts that are related to the first concepts, and second weights for the second concepts in predefined segment categories, and optionally identified ratings of the advertising campaigns.
 16. The system of claim 15, wherein at least one of the semantic analysis module and the semantic expansion module, is, when executed by the at least one of the one or more processors, configured to use additional content retrieved from the network for determining the respective weighted semantic metadata.
 17. The system of claim 10, wherein the system comprises at least one of a target database storing semantic metadata for target content, the matching module having, when executed by the at least one of the one or more processors, access to the target database, and a target analyzing system.
 18. The system of claim 17, wherein the target analyzing system comprises at least one of: one or more processors; a semantic analysis module that is, when executed by the at least one of the one or more processors, configured to determine primary weighted semantic metadata for target content using a concept database storing weights for concepts within predefined segment categories; a semantic expansion module, that is, when executed by the at least one of the one or more processors, configured to determine secondary weighted semantic metadata for the target content using a database storing first concepts, first weights for the first concepts in predefined segment categories, and optionally identified ratings of the target content; and a storage module configured to store the respective weighted semantic metadata and optionally the identified ratings of the target content in a target database.
 19. The system of claim 10, wherein the system comprises at least one server which is connectable to the network and, in a connected state, configured to initiate sending via the network an advertising campaign to a client comprising a display displaying the target content, the advertising campaign comprising a matching parameter above a predefined threshold.
 20. The system of claim 19, wherein the system is configured to initiate sending the advertising campaign at least substantially based on the matching parameter, or at least substantially based on the matching parameter and identified ratings of the target content and/or the advertising campaign, and/or wherein the system is configured to initiate sending the advertising campaign without taking into account tracking data of registered users of the client.